Efficient Factorization of Synchronous Context-Free Grammars
Abstract
Factoring a Synchronous Context-Free Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables more efficient strategies for synchronous parsing. We present an algorithm for factoring an n-ary SCFG into a k-ary grammar in time O(kn). We also show how to efficiently compute the exact number of k-ary parsable permutations of length n, and discuss asymptotic behavior as n grows. The number of length n permutations that are k-ary parsable approaches a fixed ratio between successive terms as n grows for fixed k. As k grows, the difference between successive ratios approaches 1/e. The University of Rochester Computer Science Department supported this work.
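To make the notion of a k-ary parsable permutation concrete, here is a minimal brute-force sketch in Python. It is not the O(kn) factorization algorithm described in the abstract; it simply applies the definition, assuming that a permutation is k-ary parsable when it can be recursively cut into at most k consecutive blocks whose values each form a contiguous range. The function names `is_k_ary_parsable` and `count_k_ary_parsable` are hypothetical, and the enumeration is exponential in n, so it is only practical for small n.

```python
from functools import lru_cache
from itertools import permutations


def is_k_ary_parsable(perm, k):
    """Brute-force check by definition (illustrative, not the paper's algorithm).

    perm is a tuple containing a permutation of 0..n-1 with n >= 1.
    """
    n = len(perm)

    def is_block(i, j):
        # perm[i:j] is a block if its values form a contiguous integer range.
        seg = perm[i:j]
        return max(seg) - min(seg) == j - i - 1

    @lru_cache(maxsize=None)
    def parsable(i, j):
        # A single nonterminal needs no further factoring.
        if j - i == 1:
            return True
        if not is_block(i, j):
            return False
        # Choose the first child perm[i:m], then cover the rest with
        # at most k - 1 further children (so every node has 2..k children).
        return any(
            parsable(i, m) and can_cover(m, j, k - 1)
            for m in range(i + 1, j)
        )

    @lru_cache(maxsize=None)
    def can_cover(i, j, parts):
        # Can perm[i:j] be cut into between 1 and `parts` parsable blocks?
        if parts < 1:
            return False
        if parsable(i, j):
            return True
        return any(
            parsable(i, t) and can_cover(t, j, parts - 1)
            for t in range(i + 1, j)
        )

    return parsable(0, n)


def count_k_ary_parsable(n, k):
    # Enumerate all n! permutations; feasible only for small n.
    return sum(is_k_ary_parsable(p, k) for p in permutations(range(n)))
```

For k = 2 the counts for small n should match the large Schröder numbers, since binary parsable permutations coincide with the separable permutations. At n = 4 the only two permutations that are not binary parsable are 2413 and 3142.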
Similar Resources
Factorization of Synchronous Context-Free Grammars in Linear Time
Factoring a Synchronous Context-Free Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables synchronous parsing algorithms of lower complexity. The problem can be formalized as searching for the tree-decomposition of a given permutation with the minimal branching factor. In this paper, by modifying the algorithm of Uno and Yagiura (2000) for the closely re...
Factoring Synchronous Grammars by Sorting
Synchronous Context-Free Grammars (SCFGs) have been successfully exploited as translation models in machine translation applications. When parsing with an SCFG, computational complexity grows exponentially with the length of the rules, in the worst case. In this paper we examine the problem of factorizing each rule of an input SCFG to a generatively equivalent set of rules, each having the smal...
Synchronous Context-Free Tree Grammars
We consider pairs of context-free tree grammars combined through synchronous rewriting. The resulting formalism is at least as powerful as synchronous tree adjoining grammars and linear, nondeleting macro tree transducers, while the parsing complexity remains polynomial. Its power is subsumed by context-free hypergraph grammars. The new formalism has an alternative characterization in terms of ...
Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor
We present progress on Joshua, an open-source decoder for hierarchical and syntax-based machine translation. The main focus is describing Thrax, a flexible, open-source synchronous context-free grammar extractor. Thrax extracts both hierarchical (Chiang, 2007) and syntax-augmented machine translation (Zollmann and Venugopal, 2006) grammars. It is built on Apache Hadoop for efficient distributed p...
Efficient Multi-pass Decoding for Synchronous Context Free Grammars
We take a multi-pass approach to machine translation decoding when using synchronous context-free grammars as the translation model and n-gram language models: the first pass uses a bigram language model, and the resulting parse forest is used in the second pass to guide search with a trigram language model. The trigram pass closes most of the performance gap between a bigram decoder and a much...
Publication date: 2006